19 research outputs found

    Author Correction: Federated learning enables big data for rare cancer boundary detection.

    10.1038/s41467-023-36188-7 · Nature Communications 14

    Federated learning enables big data for rare cancer boundary detection.

    Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability remains a concern. This is currently addressed by sharing multi-site data, but such centralization is challenging or infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
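
The abstract states only that sites share numerical model updates rather than data. As a minimal sketch of how such updates might be combined into a consensus model, the example below uses sample-count-weighted averaging (in the style of federated averaging). The aggregation rule, the `federated_average` name, and the toy weights are assumptions for illustration, not the study's confirmed method.

```python
from typing import Dict, List, Tuple

# A model is represented here as named lists of floats (an illustrative assumption).
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine per-site weights, weighting each site by its sample count."""
    if not updates:
        raise ValueError("need at least one site update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    first_weights, _ = updates[0]
    averaged = {name: [0.0] * len(values) for name, values in first_weights.items()}

    for weights, n in updates:
        share = n / total  # fraction of all samples held by this site
        for name, values in weights.items():
            for i, value in enumerate(values):
                averaged[name][i] += share * value
    return averaged


# Two hypothetical sites: only weights and sample counts leave each site.
site_a = ({"layer": [1.0, 2.0]}, 100)
site_b = ({"layer": [3.0, 4.0]}, 300)
consensus = federated_average([site_a, site_b])
print(consensus)
```

Raw patient data never appears in the function's inputs, which is the property the abstract emphasizes.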

    Efficient Metadata Management for Cloud Computing applications

    Cloud computing applications require a scalable, elastic and fault-tolerant storage system. In this paper, we describe how metadata management can be improved for a file system built for large-scale data-intensive applications. We implement Ring File System (RFS), which uses a single-hop Distributed Hash Table, found in peer-to-peer systems, to manage its metadata and a traditional client-server model for managing the actual data. Our solution does not have a single point of failure, since the metadata is replicated, and the number of files that can be stored and the throughput of metadata operations scale linearly with the number of servers. We compare against two open source implementations of Google File System (GFS), HDFS and KFS, and show that our prototype performs better in terms of fault tolerance, scalability and throughput.
    Status: published or submitted for publication; not peer reviewed.
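
To make the single-hop idea concrete, here is a minimal sketch in which every client holds the full list of metadata servers, so one local hash lookup names the servers responsible for a file's metadata. The `MetadataRing` class, the SHA-1 hashing, the successor-based replica placement, and the server names are all assumptions for illustration. The abstract does not describe RFS at this level of detail.

```python
import bisect
import hashlib
from typing import List


def _ring_position(key: str) -> int:
    """Map a string to a point on the hash ring (first 8 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class MetadataRing:
    """Hypothetical single-hop lookup: the client knows every server on the ring."""

    def __init__(self, servers: List[str], replicas: int = 3) -> None:
        if not servers:
            raise ValueError("need at least one metadata server")
        self._ring = sorted((_ring_position(s), s) for s in servers)
        self._positions = [position for position, _ in self._ring]
        self._replicas = min(replicas, len(self._ring))

    def owners(self, path: str) -> List[str]:
        """Return the primary server for `path` followed by its replica holders."""
        start = bisect.bisect_right(self._positions, _ring_position(path))
        size = len(self._ring)
        # Walk clockwise from the first server past the key, wrapping around.
        return [self._ring[(start + i) % size][1] for i in range(self._replicas)]


servers = ["meta-0", "meta-1", "meta-2", "meta-3"]
ring = MetadataRing(servers, replicas=3)
owners = ring.owners("/data/input/part-00001")
print(owners)
```

Because the lookup needs no routing through intermediate nodes, adding servers spreads keys across more owners without adding hops, which is consistent with the linear scaling the abstract reports.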

    Accurate Sequence Alignment using Distributed Filtering on GPU Clusters

    The advent of next-generation gene sequencing machines has led to computationally intensive alignment problems that can take many hours on a modern computer. Considering the rapidly increasing rate at which new short sequences are produced, the large number of existing sequences, and inaccuracies in the sequencing machines, short sequence alignment has become a major challenge in High Performance Computing. In practice, gaps as well as mismatches are found in genomic sequences, resulting in an edit distance problem. In this paper we describe the design of a distributed filter, based on shifted masks, to quickly reduce the number of potential matches in the presence of gaps and mismatches. Furthermore, we present a hybrid dynamic programming method, optimized for GPGPU targets, to process the filter outputs and find the accurate number of insertions, deletions and mismatches. Finally, we present results from experiments performed on an NCSA cluster of 128 GPU units using the Hadoop framework.
    Status: unpublished; not peer reviewed.
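
The edit distance problem mentioned in the abstract can be illustrated with the textbook dynamic programming recurrence, which counts the minimum number of mismatches and gaps needed to turn a read into a reference string. This is a plain CPU sketch of that standard recurrence, not the paper's shifted-mask filter or its hybrid GPGPU method. The function name and the sample sequences are illustrative.

```python
def edit_distance(read: str, reference: str) -> int:
    """Minimum number of mismatches and single-base gaps between two sequences."""
    # previous[j] holds the distance between the processed prefix of `read`
    # and the first j characters of `reference`.
    previous = list(range(len(reference) + 1))
    for i, read_base in enumerate(read, start=1):
        current = [i]  # aligning i read bases against an empty reference
        for j, ref_base in enumerate(reference, start=1):
            mismatch = previous[j - 1] + (read_base != ref_base)
            gap_in_read = current[j - 1] + 1       # reference base left unmatched
            gap_in_reference = previous[j] + 1     # read base left unmatched
            current.append(min(mismatch, gap_in_read, gap_in_reference))
        previous = current
    return previous[-1]


print(edit_distance("ACGT", "ACCT"))  # one mismatch
print(edit_distance("ACGT", "AGT"))   # one gap
```

The table has one cell per pair of positions, so the cost grows with the product of the two lengths, which is why a filter that discards most candidate locations first is attractive.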

    Veno-Arterial Extracorporeal Membrane Oxygenation in Patients with Fulminant Myocarditis: A Review of Contemporary Literature

    Fulminant myocarditis is characterized by life-threatening heart failure presenting as cardiogenic shock requiring inotropic or mechanical circulatory support to maintain tissue perfusion. There are limited data on the role of veno-arterial extracorporeal membrane oxygenation (VA-ECMO) in the management of fulminant myocarditis. This review seeks to evaluate the management of fulminant myocarditis with a special emphasis on the role and outcomes of VA-ECMO use.